In this report I discuss the insights and findings of my analysis.
The first insight I checked was whether there is a correlation between retweet_count and favorite count. As the graph shows, there is a positive correlation between these two variables: the higher the retweet count, the higher the favorite count.
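A correlation like this can also be checked numerically. The sketch below is only an illustration: the data is a small made-up sample, and the column name favorite_count is an assumption. It computes the Pearson correlation coefficient with pandas.

```python
import pandas as pd

# Hypothetical sample data; the real analysis would use the cleaned tweet table.
# The column name "favorite_count" is an assumption.
df = pd.DataFrame({
    "retweet_count": [10, 50, 120, 300, 800, 1500],
    "favorite_count": [40, 160, 400, 900, 2600, 4300],
})

# Pearson correlation coefficient: values near +1 indicate a strong
# positive linear relationship between the two columns.
r = df["retweet_count"].corr(df["favorite_count"])
print(f"Pearson r = {r:.3f}")
```

A value close to 1 would support what the scatter plot suggests visually.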
1st figure
2nd figure
3rd figure
In the three figures above, I compared the percentage of each dog_stage in the prediction column. The first figure shows the first prediction, which has high precision: six of the dog stages have a prediction accuracy rate above 80%. Notice that it is the only figure whose y-axis scale reaches 100% (1.0). Figure 2 goes only as high as 0.5, and figure 3 is the lowest at 0.25. Figure 1 therefore shows a high prediction accuracy.
1st figure
2nd figure
3rd figure
Here I wanted to measure how the true and false values compare between the three predictions, and which prediction has the highest and lowest count of each. According to the chart above, there is little disparity between the false values: the first prediction has 500 false values, the second has the lowest of the three at just below 500, and the third has the highest at just above 500. This means the second prediction has the lowest percentage of false values. It also has the highest percentage of true values. Taken together, we can conclude that the second prediction is the most accurate of the three.
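The true/false comparison above can be sketched in a few lines of pandas. This is only an illustration on made-up data: the column names p1_dog, p2_dog and p3_dog are assumptions for the three True/False prediction flags.

```python
import pandas as pd

# Hypothetical sample; p1_dog, p2_dog and p3_dog are assumed column names
# holding the True/False flag of each of the three predictions.
df = pd.DataFrame({
    "p1_dog": [True, True, False, True],
    "p2_dog": [True, True, True, False],
    "p3_dog": [True, False, False, True],
})

flag_cols = ["p1_dog", "p2_dog", "p3_dog"]

# The mean of a boolean column is the share of True values in it.
true_share = df[flag_cols].mean()
false_share = 1 - true_share

summary = pd.DataFrame({"true_share": true_share, "false_share": false_share})
print(summary)
```

Comparing shares rather than raw counts makes the three predictions easier to rank when the counts are as close as they are here.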
1st figure
2nd figure
In these two figures I wanted to get a feel for the mode, maximum and minimum values. In the first figure I wanted to know how many times each dog_stage appears, so that I know which dog stage occurs most often. The data is poor because more than 60% of the dog stages are unknown, but putting that aside, the dog stage that appears the most is the pupper.
The second graph shows all the ratings given. The most common rating was 12/10, followed by 10/10 and 11/10.
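Counts like these can be produced with value_counts. The following is a minimal sketch on made-up data, assuming a dog_stage column with missing values for unknown stages and a rating_numerator column (both names are assumptions).

```python
import pandas as pd

# Hypothetical sample; column names are assumptions.
df = pd.DataFrame({
    "dog_stage": ["pupper", None, "doggo", "pupper", None, None, "puppo", None],
    "rating_numerator": [12, 10, 12, 11, 12, 10, 13, 12],
})

# Count every stage, labelling missing values as "unknown".
stage_counts = df["dog_stage"].fillna("unknown").value_counts()
unknown_share = stage_counts["unknown"] / len(df)

# Most frequent stage once the unknown rows are set aside.
most_common_stage = stage_counts.drop("unknown").idxmax()

# Most frequent rating, plus the lowest and highest rating.
rating_counts = df["rating_numerator"].value_counts()
most_common_rating = rating_counts.idxmax()
lowest, highest = df["rating_numerator"].min(), df["rating_numerator"].max()

print(stage_counts)
print(most_common_stage, most_common_rating, lowest, highest)
```

Reporting the share of unknown stages next to the counts makes it clear how much of the data the conclusion rests on.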
1st figure
2nd figure
The first figure looks for a correlation between favorite count and rating, to see whether a higher favorite count means a higher rating as well. The data fluctuates, which shows that there is no correlation.
The second figure compares the ratings by year: for each year, it records how many times each rating appears in the data, which gives the most popular rating per year. In 2015 it was 10/10, in 2016 it was 88/80 and in 2017 it was 99/90. The three highest ratings come from 2016.
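One way to find the most popular rating per year is to group by year and take the mode. The sketch below uses made-up data, and the column names timestamp and rating are assumptions; it is not necessarily how the figure was produced.

```python
import pandas as pd

# Hypothetical sample; "timestamp" and "rating" are assumed column names.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2015-11-20", "2015-12-01", "2015-12-15",
        "2016-03-10", "2016-07-04", "2016-09-09",
        "2017-01-05", "2017-02-14", "2017-06-30",
    ]),
    "rating": ["10/10", "10/10", "11/10",
               "12/10", "11/10", "11/10",
               "13/10", "13/10", "12/10"],
})

df["year"] = df["timestamp"].dt.year

# For each year, take the most frequent rating (the first one if there is a tie).
top_per_year = df.groupby("year")["rating"].agg(lambda s: s.mode().iloc[0])
print(top_per_year)
```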
1st figure
2nd figure
At this stage I wanted to see which month had the highest favorite count and retweet count. In both cases it was June.
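A monthly comparison like this can be sketched by grouping on the month name. The data below is made up, the column names are assumptions, and summing the counts per month is also an assumption (an average per month would be an equally reasonable choice).

```python
import pandas as pd

# Hypothetical sample; column names are assumptions.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2016-05-03", "2016-06-11", "2016-06-25", "2016-07-08"]
    ),
    "favorite_count": [1200, 5400, 4100, 3000],
    "retweet_count": [300, 1800, 1500, 900],
})

# Total favorites and retweets per calendar month.
month = df["timestamp"].dt.month_name()
monthly = df.groupby(month)[["favorite_count", "retweet_count"]].sum()

# Month with the highest total for each metric.
top_months = monthly.idxmax()
print(top_months)
```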